Approximate diagonalization method for many-fermion Hamiltonians 
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The limits of direct unitary transformation of many-fermion Hamiltonians are explored. Practical 
application of such transformations requires that effective many-body interactions be discarded 
over the course of a calculation. The truncation of the Hamiltonian leads to finite errors and 
in some cases divergences. A new formalism is proposed to manage errors and avoid divergences. 
Removing all interactions from a many-fermion Hamiltonian reduces it to fermion number operators 
allowing for direct calculation of eigenvalues. If the same transformations are applied to the bare 
fermions, eigenfermions are produced whose Slater determinants form eigenstates. This enables a 
hierarchy of diagonalization methods of increasing accuracy as fewer interactions are discarded from 
the Hamiltonian. 
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I. INTRODUCTION 

Simulation of interacting fermions is difficult. The root 
of the difficulty is the large size of the fermion config- 
uration space, which is exponential in the number of 
fermion degrees of freedom. Brute force calculations on 
the large space are only tractable for small model systems 
and small molecules 1 . Approximations are necessary, but 
there is no concensus on an approximate method that 
is sufficiently accurate and efficient for systems of inter- 
est. Development is split mainly between a few pop- 
ular approaches based on well-established ideas: many- 
body perturbation theory with a variety of resummations 
and empirical parameterization 2 , quantum Monte Carlo 
to statistically sample large configuration spaces 3 , and 
methods based on renormalization group (RG) princi- 
ples. The two most popular RG methods are based on 
the extreme limits of one 4 and infinite 5 spatial dimen- 
sions. 

Another, less popular RG approach exists that is based 
on preserving the form of a many-fermion Hamiltonian 
under unitary transformations 6 . With no reference to 
spatial dimension or scale, it is less fundamentally re- 
stricted than other RG approaches. Its primary use to 
date has been in decoupling weakly correlated fermions 
from a system to produce a smaller strongly correlated 
subsystem to be solved using other methods 7,8 . Attempts 
to decouple all fermions with this method have resulted 
in slow convergence and the appearance of numerical 
divergences 7 ' 9 . What remains unclear are the source of 
these problems, and whether they present a fundamental 
barrier to improvement or merely a technical barrier. 

The work presented in this paper addresses the tech- 
nical problems of previous many-fermion transformation 
methods. The fundamental source of error in these meth- 
ods is the truncation of the Hamiltonian after each trans- 
formation, to remove terms outside a prescribed Hamil- 
tonian form. Divergences resulting from truncation can 
be eliminated by conserving a set of quantities that 
are naturally conserved by exact unitary transformation. 



The restrictions placed on operator truncation specify a 
unique form. In a flow equation framework, the conti- 
nous transformation of the Hamiltonian results in con- 
tinuous growth of truncation errors. All previous errors 
get locked into the solution. To minimize the total trun- 
cation error, the continuous transformation is grouped 
into discrete fragments, each of which is carefully op- 
timized based on an error minimization criteria. Some 
finite amount of truncation error inevitably remains and 
can only be reduced further by truncating fewer terms 
from the transformed Hamiltonian. 

Completely decoupling all fermions in a many-fermion 
Hamilonian reduces it to a diagonal form containing only 
fermion number operators. The transformation that di- 
agonalizes a many-fermion Hamiltonian can be applied 
either directly to the Hamiltonian or the elementary 
fermion operators. The transformed fermions are called 
eigenfermions. Eigenstates are Slater determinants of 
eigenfermions. Whereas the eigenvalue decomposition of 
a matrix produces a list of all eigenvalues, the eigen- 
fermion decomposition of a many-fermion Hamiltonian 
produces a function of eigcnfcrmion number operators. 
Eigenvalues are evaluated by replacing number operators 
with occupation numbers. The complete set of eigen- 
states is parameterized by a configuration space of eigen- 
fermion occupation numbers much like the states of a 
classical system lie in a configuration space of classical 
variables. Based on this analogy, the diagonalization 
of a many-fermion Hamiltonian can be intcrprcttcd as 
a quantum-to-classical mapping. 

The paper proceeds as follows. Section II defines and 
discusses the concept of an eigenfermion. Section III con- 
structs a general mathematical theory of truncated uni- 
tary transformations and truncated eigenvalue decompo- 
sition. Section IV applies the general theory to the many- 
fermion case. Section V discusses what can be computed 
as a result of a truncated eigenfermion decomposition and 
the associated computational costs. 
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II. THE EIGENFERMION CONCEPT 

The many-fermion Hamiltonians that describe physi- 
cal systems typically contain only l&2-fermion interac- 
tions. The structure of these Hamiltonians is compactly 
encoded in the standard language of second quantization, 

H = y^hijc}cj + ^2 VijklCiVjCkCi. (1) 

i,3 i,j,k,l 

If H is diagonalizcd in the basis of Slater determinants 
by a unitary transformation U, then the transformed 
Hamiltonian contains only fermion number operators, 

hi = c\ci, 

U^HU = E + J2 E i™i + + ' ' ' • ( 2 ) 

i i,j 

All possible products of n, operators can appear in this 
expression. The l&2-fcrmion form of H in Eq. (1) cannot 
generally be preserved by the diagonalization procedure. 
The eigenvalues and eigenstates of H are parameterized 
by an occupation vector f with entries fa e {0, 1}, 

E(f)=E + J2 E i fi + E E H fifj + -" ( 3a ) 

i i,j 

l*( f )) = ^l f ) = U JJ[(1 - fi) + /»cj]|0). (3b) 

i 

Each |f) is a Slater determinant and |0) is the zcro- 
fermion vacuum state. 

Occupation vectors are convenient mathematical labels 
for eigenstates in Eq. (3), but to have physical signifi- 
cance they must describe the occupations of a physical 
object. This object is a fermion and because of its special 
relationship to eigenstates, it is named an eigenfermion. 
The U that diagonalizcs H can be used to define eigen- 
fermion operators from the bare fermions of the system, 

qi = UdiU^ and rhi^qjqi. (4) 

The anti-commutation relations of q are inherited by q\. 
By rearranging Eq. (2), the original Hamiltonian can be 
written solely in terms of eigenfermion number operators, 

H = E a + ^2 E i™i + X] E lJ rh l m j H . (5) 

* i,j 

The eigenstates can be written as a Slater determinant 
of eigenfermions acting on a new vacuum, |0) = U\0), 

i*(f)>=n[( i -^)+M t ]i0)- (6) 

i 

If U preserves total fermion number, then |0) = |0). It 
is clear from Eqs. (5) and (6) that occupation vectors 
labelling the eigenstates specify eigenfermion occupation 
numbers. 



The only properties that eigenfermions are guaranteed 
to share with the bare fermions of a system are those pre- 
served by symmetry. Such symmetries must be explicitly 
preserved by U in Eq. (4). If total fermion number is 
conserved, then the number of eigenfermions in a state 
corresponds to the number of fermions. If total fermion 
spin is conserved, then eigenfermions will have the same 
well-defined spin as the bare fermions. If translational 
invariance is conserved, then eigenfermions will have a 
well defined crystal momentum. Unless completely con- 
strained by symmetry, the eigenfermions that diagonalize 
a Hamiltonian are non-unique. However, it is possible to 
define a unique set of eigenfermions that are in some sense 
most similar to the bare fermions. 

Practical calculations of Eq. (2) will require approx- 
imations. The generic form of an approximately diago- 
nalized Hamiltonian is 

TpHU = D + R, (7) 

where D contains only fermion number operators and R 
is a residual interaction. The physical effect of R is to 
scatter eigenfermions, reducing |^(f)) from an eigenstate 
to a finite-lifetime nearly-stationary state. This situation 
resembles a theory of fermion quasiparticles that are adi- 
abatically connected to the bare fermions. Such a quasi- 
particle theory can be cast in the form of Eq. (7) with a 1- 
fermion diagonal term, D — J^. Cihi, and R that weakly 
scatters between a ground state |^(fcs)) and certain 
few-quasiparticle excited states |*(f_x)}, {ix\R\^Gs) ~ 0. 
Eigenfermions generalize these fcrmion quasiparticles to 
allow for strong non-scattering interactions in D while 
requiring a weak residual interaction between all states, 
R « 0. If R is sufficiently small, then U exactly diago- 
nalizes a perturbed Hamiltonian, 

U\H + AH)U = D with AH = -URU^. (8) 

Constructing the exact solution to a system slightly dif- 
ferent from what was intended is similar to experimenting 
on impure samples. 

There are many distinct approaches in the 
literature 6-8, 10-12 for approximating U^HU for a 
general I&2-fermion Hamiltonian and a wide class of 
transformations. Unfortunately, none of these methods 
have claimed to generally and reliably solve Eq. (7) 
with a small residual error R. The deficiency is often 
attributed to nearly degenerate states or strong fermion 
correlations. Inadequate closure of equations can result 
in uncontrolled growth of R and numerical divergences 7 . 
No analysis has yet isolated the fundamental source of 
divergences. No a priori criteria has yet to guarantee the 
existence of solutions. Recent progress has focused on 
unitary transformations that only partially diagonalize 
a Hamiltonian, leaving the rest of the problem to other 
many-fermion methods. A lack of progress in full 
diagonalization might be blamed on the complexity of 
many-fermion algebra. It proves productive to step back 
from the many-fermion case and construct a general 
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theory of operator diagonalization in the presence of 
non-negligible truncation errors. 



III. TRUNCATED EIGENVALUE 
DECOMPOSITION THEORY 

The goal of this section is to develop a basic theory 
for the effect of truncation errors on continuous unitary 
transformations. A continuous unitary transformation of 
an operator X is defined by a differential equation of a 
real variable A, 

^X(X) = [A(X),X(X)], (9) 

specified by an anti-Hermitian generator function A(X) 
and starting from X(0) = X. Operator truncation will 
modify this equation. Diagonalization of a Hcrmitian 
operator H is possible if the modified form of Eq. (9) 
can be evolved to a diagonalized form, H(oo) = D. A 
theory should provide a priori conditions that guarantee 
the success of the diagonalization process. Specifically, 
the evolution of H (A) should contain no divergences and 
no stable, non-diagonal fixed points. 

The form of truncation assumed by the theory is that 
X(X) is restricted to a given model subspace of the full 
operator space. There is a corresponding restriction of 
A(X) to a given generator subspace. If the elements of the 
model and generator subspaces form a Lie algebra, then 
no truncations are necessary and an established diago- 
nalization method exists 13 . Otherwise, Eq. (9) cannot 
be calculated exactly and a truncation procedure must 
be defined to close the equation. For the application to 
many-fermion systems, a wide variety of physically moti- 
vated truncation procedures have been proposed. What 
is more suitable for a general theory is a truncation that 
preserves Tr[A" l '(A)A(A)] as a conserved quantity over A. 
This criteria prevents divergences during the truncated 
transformation process and heavily constrains the form 
of truncation. 

The constraint on truncation is clarified by elevating 
the space of operators to a Hilbert space, which requires 
the definition of an operator inner product, 

(X,Y)=Tr[X*Y]. (10) 

An operator inner product naturally defines an operator 
outer product X ® Y, 

{X®Y)Z = X(Y,Z), (11) 

which can be used to construct linear maps between oper- 
ators, commonly referred to as superoperators. Without 
constraint, the most general possible form for truncation 
of Eq. (9) is 

±X M (X)=f(X)[A(X),X M (X)}, (12) 



where 7~(A) is a truncation superoperator acting on the 
result of the commutator. Am (A) is an approximation 
of A (A) restricted to the model subspace, intialized to 
X M (0) = f (0)A. The range of T(A) must be the model 
operator subspace to close the equation. 

The target conserved quantity is now the natural norm 
of the operator Hilbert space, 

||*|| = V / (A,X) = A /Tr(XtX). (13) 
If Eq. (12) is suggestively rewritten as 

^Am(A) - 1(A) A M (A), (14) 

then A(X) must be anti-Hermitian to conserve ||Am(A)||, 

±\\X M (X)\\ 2 = (A M (A), [A(X)+A\X)]X m (X)). (15) 

Since T(A) restricts the range of A(X), anti-Hermicity 
correspondingly restricts the domain. 

Specifying the constrained forms of -4(A) and T(A) re- 
quires the definition of a few more pieces of notation. 
First, the commutator can be written as a superoperator, 
which is anti-Hermitian for an anti-Hermitian argument, 

C[X]Y=[X,Y}. (16) 

Second, a Hermitian projection superoperator is defined 
for the model subspace, 

V M = ^Mi®M u (17) 

i 

where {Mi} is an orthonormal basis of the model sub- 
space. In terms of these new superoperators, the most 
general truncation that conserves ||Am(A)|| is 

f (A) = kV M + V M C[A{X)]K{X) (18a) 
A(X) = f (X)C[A(X)]V M , (18b) 

where k is real and IC(X) is anti-Hermitian. 

Simple error minimization arguments complete the 
specification of T(A) . The residual error of the truncated 
transformation is 

^X R (X) =[X - f(X)][A(X),X M (X)} + [i(A), X R (X)}, 

(19) 

with the identity superoperator I and initial condition 
Xr(0) = X — A"m(0). The second term just rotates the 
residual and is cancelled by considered a prerotated form, 
iffi(A) = U(X)X R (X)Ul(X), defined by 

-^-U(X) = -U(X)A(X) and U(0) = I. (20) 



4 



Assuming the details of Xr(\) are unknown, the best 
strategy for minimizing ||Ar(A)|| is to minimize 

\\dX R2 (X)/dX\\ 2 = \\(1-V M )[A(X),X M (X)}\\ 2 (21) 
+ ll[^M-f(A)][i(A),X M (A)]]|| 2 . 

The growth of error is minimized by a unique choice of 
truncation, T(A) = Vm (k = 1 and tC(X) = 0), which 
cancels the second term. Operator truncation is now de- 
termined solely by the choice of model subspace. 

By combining Eqs. (12), (19), and (20), the exact 
transformation of X can be partitioned into 

W(X)XU(X)=X M (X)+X R (X). (22) 

To conform to the notation in Eq. (7), the truncated 
eigenvalue decomposition of a Hermitian operator H is 
defined as 

U<< (oc) HU (oo) = D + R, (23) 

for a diagonal operator D = Hm(oo) and a residual error 
R = H R (oc). The rest of the section is devoted to the 
theoretical issues of existence and uniqueness of solutions 
and the technical issues of efficient computability and 
error minimization. 

The diagonalization of a Hermitian operator places 
special emphasis on Hermitian and diagonal operators. 
Truncation should preserve both of these properties. 
This can be enforced with restrictions on the model sub- 
space. If X is in the subspace, then X^ must also be 
in the subspace. The model subspace must also be com- 
pletely separable into a purely diagonal and purely off- 
diagonal subspace. Projection superoperators can sepa- 
rate diagonal from off-diagonal ("coupling") operators, 

P D = ^\i){i\®\i){i\ (24a) 

i 

£ C = £N)01®K>01, (24b) 

where {\i)} is an orthonormal basis of states that defines 
diagonality. The model projector can be separated into 
Vm = V M +V M , where V M = V M V D and V M = V M V C . 
Model operators can similarly be split into X M {X) = 
V D X M (X) and Xg(X) = V C X M {X). 

A. Existence and computability 

The most straightforward way to prove the existence of 
a solution to Eq. (23) is to construct one. This simulta- 
neously demonstrates that solutions are computable and 
lays the groundwork for a solution method. Construction 
of a solution is guided by a convergence metric, 

^(A) = \\\H M {X)\\ 2 , (25) 



which approaches zero as Eq. (23) is satisfied. A solution 
exists if it is possible to choose A(X) to monotonically 
reduce 0(A) to zero. This places important constraints 
on the generator subspace that contains A(X). 

It is almost always possible to reduce 0(A) by enforcing 
dn(X)/dX < 0. The first derivative of 0(A) is 

^O(A) = -([H M (X),H M (X)},A(X)). (26) 

To maximize the rate of decrease of 0(A), the two op- 
erators in the inner product should be made as close 
to parallel as possible. This criterion is limited by the 
restriction of A(X) to the generator subspace, which is 
enforced with a projection superoperator defined by an 
orthonormal basis {Gi} of the generator subspace, 

V G = J2 G i® G i- ( 27 ) 

i 

The most parallel A(X) is 

A(X)=V G [H M (X),H M (X)]. (28) 

Without truncation, this procedure produces anti- 
Hermitian, off-diagonal operators. These properties are 
preserved by truncation if the generator subspace is 
similarly restricted to contain only anti-Hcrmitian, off- 
diagonal operators. 

In some cases, diagonalization can get stuck at a non- 
diagonal fixed point with dfl(X)/dX = and 0(A) / 0. 
This occurs when i(A) = and H&(X) ^ in Eq. (28) 
and signifies that H^(X) is contained in the right null 
space of ■P G C[H^(X)]V^ 1 . To further reduce O(A), the 
second derivative must be considered, 

J^(A) = - ([H°(X), H c M (X)l j- x A{X)^ 

+ (P M [^(A),i(A)],[^(A),i(A)]) 

+ ||P M [^(A),i(A)]|| 2 

-\\VZ[H C M W,M\)]\\ 2 - (29) 

Monotonicity requires d 2 Vl(X) / dX 2 < 0, but further re- 
duction of O(A) requires cither d 2 Q(X)/dX 2 < or the 
consideration of even higher derivatives of O(A). 

To guarantee that O(A) can be reduced to zero, the 
effects of the null space of 'PgC[-H^(A)]'P^ have to be 
addressed. If this superoperator's domain is restricted 
to the Hermitian off-diagonal model subspace and the 
range to the generator subspace, then constraining the 
dimensions of the subspaces to be equal is a necessary 
condition for the absence of a null space. If a null space 
remains, then the right null space has a corresponding left 
null space. When ciO(A)/c?A = 0, the second derivative 
of O(A) can be made non-positive by restricting ^4(A) to 
the left null space, 

-4^(A) = -||^[^(A),i(A)]|| 2 . (30) 



■5 



The dCl(X)/dX = fixed point is unstable if the second 
derivative can be made negative and nonzero. A sufficient 
existence criteria is that for any H^ { (X) in the right null 

space of T> G C[H M (X)\P M there must cxist an A W in tne 
left null space such that Eq. (30) is nonzero. 

To demonstrate that the existence criteria is necessary, 
the untruncated case is considered. Without truncation, 
there is analytic eigenvalue decomposition of C[H^(X)], 

c[H°(\)}\i)(j\ = (mtim) (j\H°w\j)) \m- 

(31) 

The off-diagonal eigenoperators naturally come in pairs 
with eigenvalues of opposite sign, {\i){j\, \j)(i\}- When 
(i\H M (X)\i) = (j\H M (X)\j), the eigenoperator pairs are 
degenerate and in the null space of C[H M (X)\. This 
signifies a case where d£l(X)/dX = and S1(A) ^ if 
i?^-(A) oc \i)(j\ + To satisfy the existence criteria 

for this case, the choice A(X) oc — makes Eq. 
(30) nonzero. 

A unique solution is defined by Eq. (28) if the deriva- 
tive of fi(A) remains nonzero until the problem is solved. 
Uniqueness is lost when dQ(X)/dX = because any A(X) 
that satisfies the existence criteria has an arbitrary sign. 
Each choice of sign leads to a different solution. This al- 
most always unique solution minimizes R in Eq. (23) in 
a weak and indirect way. The growth of truncation errors 
in Eq. (21) is proportional to ||A(A)||, and a smaller total 
error results from less transformation. Orthogonal com- 
ponents could be added to Eq. (28) to produce different 
solutions, but this increases ||^4(A)|| and thus truncation 
errors without affecting convergence to a solution as mea- 
sured by d£l(X)/dX. While a more sophisticated theory 
may be possible, this simple choice of solution establishes 
both uniqueness and error minimization for solutions to 
Eq. (23) in a weak but practical form. 



calculating each Ai and minimizing the number required 
to diagonalize H. 

Eq. (32) can be efficiently evaluated as a Taylor series 
cast in a recursive form, 



Zj — -rVhilAi, Zj-i] 

oo 

^ Zj = exp(Ai)X M ,i-i, 

3=0 



(33a) 
(33b) 



with Zq = XM,i-i- If Xm,%~\ is overwritten with the sum 
over Zj , then the total memory requirement of this pro- 
cess is the storage of Xm,i-i, Ai, Zj, and Zj_\. The cal- 
culation is terminated with a desired accuracy is reached, 
as estimated by a posteriori and a priori error bounds, 



j=d+l 



< d\ adi i M) \\Z d \\ (34a) 

" i\\M) d 

<ad(||A||)||AM,,-i||. (34b) 



a,d(x) bounds the error of a finite Taylor series approxi- 
mation of the imaginary exponential, 



a d (x) 



(ix) 



exp(zx) - 2^ — 
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3=0 J 



(35) 



Eq. (34) is derived using the superoperator spectral norm 
and a superoperator function inequality, 



\\f{A 3 )X\\ < \\X\\ max |/(^)|, 
|z|<||A,-|| 

IIA^II 



II All — max 

y \\Y 



(36a) 
(36b) 



B. Efficient solution method 

The continuous minimization of fi(A) can be used to 
construct solutions to Eq. (23), but it is not the most 
efficient method. Numerical evolution of a differential 
equation is required to calculate Hm{X). A{X) has to 
be stored at many A values to transform operators after 
H has been diagonalized. These problems are avoided if 
A(X) is restricted to a piecewise constant function defined 
by a finite set of generators {Ai}. The continuous trans- 
formation in Eq. (12) can be analytically integrated over 
each constant generator to produce a sequence of unitary 
superoperator exponentials, 



X 



M., 



exp(A)* 



M.i 



(32) 



with Xm,o = VmX and Ai = VmCIA^Vm ■ Some many- 
fcrmion methods 8,10 ' 14 use a single exponential for trans- 
formations, but section III A cannot guarantee solutions 
in this case. Efficiency is improved by reducing the cost of 



|| Ai || can be efficiently calculated by restricting Y to the 
model subspace and using the Lanczos method 15 . 

To guarantee the existence of solutions as in section 
III A, a method must calculate the Ai sequentially by 
reducing a discrete analogue of fi(A), 



&i[Ai] = - \\V C eMA l )H M ^-i\ 



(37) 



Direct minimization of rii[^»] requires the calculation of 
an accurate gradient, which is prohibitively expensive for 
large \\Ai\\. The gradient of Oj[ Ai] can be avoided by con- 
structing a cheaper bounding functional, OjL4i] < AjL4i]- 
Minimization of A^L4i] approximately minimizes f2j[Aj]. 
With this strategy, the exponential of Ai only needs to be 
accurately evaluated once per i to calculate Hm,i- The 
tightness of the AjjA] bound will determine the number 
of generators required to solve Eq. (23). Specifically, if 
Ai[Ai] -Qi[Ai] oc || Ai || d for small ||A||, then the asymp- 
totic convergence will be f2jL4i] oc ^lf_ x [A-i]- The only 
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limitation on the choice of Aj[j4j] is that it must match 
fiiL4j] up to second order in Ai to correctly treat the 
cases in section III A where cPfl(\)/d\ 2 is required to 
calculate A(X). 

Ultimately, the choice of Aj [Ai] must be guided by the 
specific details of a particular application and computing 
environment. Some examples of bounds on Oj[j4j] are 



\l^i[Ai] < 



.7=0 



3- 



+ a d (\\Ai\\)\\H M ,i-i\\, 
(38) 

derived by splitting the exponential Taylor series with 
the triangle inequality and applying Eq. (36a). If one of 
these bounds is minimized, a<j(||.4i||) effectively acts as a 
Lagrange multiplier to constrain the size of || Ai || and thus 
reduce unnecessary transformations and their associated 
truncation errors. Whatever the choice of AjLAj], stan- 
dard methods can be used for functional minimization. 
Convergence of the minimization procedure will depend 
on the condition number of the Hessian matrix and the 
ability to effectively precondition the gradient. 



IV. APPLICATION TO FERMIONS 

Several ingredients are required to apply the theory 
developed in section III to many-fermion systems. First, 
new notation is introduced to simplify many-fermion op- 
erator algebra. Second, model and generator subspaces 
are chosen to satisfy the existence criteria in section III A. 
Third, a bounding functional Aj[A,] and its gradient pre- 
conditioner are specified to satisfy the requirements in 
section III B. 

A few basic but non-standard occupation vector oper- 
ations are used throughout this section. Vector-valued 
operations are modular addition and three set-theoretic 
operations, defined by their components as 

[feg]; = + mod 2 (39a) 

[fng|i = /tfi (39b) 

[f\g]i = fi-fi9i (39c) 
[fUg]i = fi + gi- fi9i- (39d) 

An occupation vector norm is defined as ||x|| = J2i \ x i\- 
Also useful is a fermion sign function, 

s(f,g) = (-l)^i(^^<iSi), (40) 

which obeys several useful identities, 

s(f,g)s(h,g) = s(f 8h,g) (41a) 

s(f,g)s(f,h) = s(f,geh) (41b) 

S (f,f) = (-l)Lll f ll/ 2 J (41c) 

S (f,g) S (g,f) = (-l) ,if|ll|g|hf ' g , (41d) 

and accounts for all sign changes resulting from fermion 
anti-commutations. 



Operator algebra is simplified by indexing operator ba- 
sis elements with pairs of occupation vectors. A simple 
example of such an operator is an outer product of Slater 
determinants, |f)(g|. However, physical many-fermion 
operators such as in Eq. (1) are compactly represented 
with products of elementary fermion operators and not 
Slater determinant outer products. Two types of opera- 
tor basis elements of this form are considered, 

B(i,g) =iLll f ©gll/ 2 J mod2 

x IlK 1 - &) + (!- 2n j )/ J -«fe 

3 

+ {^ + c 3 ).f 3 (l-g 3 ) 

+ i(c]-c j )(l -fj) gj ], (42a) 

C(f,g) =[][(! - - 9i) + (1 - 2hj)f jgj 

3 

+ c]f j (l-g j ) + c j (l-f j )g j ], (42b) 

each with distinct advantages and disadvantages. 

B(f, g) and C(f, g) share several basic properties. 
They are both off-diagonal for f ^ g and diagonal for 
f = g. Also, B(i,{) = C(i, f). Each set of elements is 
trace-orthogonal and both have simple normalizations, 



\B(f,s) 



2" and ||C(f,g) 



>n-||feg|| 



(43) 



where n is the total number of fermion degrees of free- 
dom. Operators can be decomposed in cither basis, 



X = 2-"^Tr[B(f,g)A]B(f,g) 



(44a) 



X = 2~ n J2 2 l|f0s,l Tr[C* t (f, g)i"]C(f, g). (44b) 
f,g 

Basis transformations can be calculated by representing 
the elements of one basis in another basis, 

C(f,g) =i-Lll fffi sll/ 2 J mod 2+||g\f|| 2 -||f®g|| 

x ^ (fffig) ' h S(f ©h,g©h) (45a) 

h\(feg)=o 

B(f,g) ^LH f ®sll/ 2 J mod2-||g\f|| 

x (-l) g ' h C(f®h,g©h). (45b) 

h\(feg)=o 

Further properties of the operators deviate. 

The B(f, g) operators are Hermitian and unitary and 
have simple algebraic properties. The action of _B(f,g) 
on a Slater determinant produces another Slater deter- 
minant with an i m phase factor, 



B(f,g)|h) =0(f,g,h)|f 0g 0h) 
0(f,g,h)=(-l)s- h s (fffig,h) 



x i 



||fffig||/2j mod 2+||g\f|! 



(46a) 
(46b) 
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The product of two £?(f, g) basis operators is a single 
other basis operator with another i m phase factor, 

B(f , g)B(h, k) =0(f , g, h, k)B(f © h, g © k) (47a) 
0(f , g, h, k) =s(f © g, h © k )(_i)(s\f)-(h\k) 
x ^_ 1 \(fng)-(k\h)+(f\ g ).(hnk) 
x j[(feg)-(h©k)-||f© g ||||hek||] mod 2 
x jll(fnk)©(gnh)||_ ^ 47b ^ 

The commutator formula is equally simple, 

[B(f, g),B(h, k)] =2*Im0(f , g, h, k)B(f © h, g © k). 

(48) 

The algebra of the B(f, g) operators bears similarities 
to the Pauli matrices and they might be considered as a 
many-fermion analogue. 

The important advantage of C(f, g) over B (i , g) is its 
ability to exploit 1-fermion symmetries. These symme- 
tries arise from 1-fcrmion invariant operators that com- 
mute with a Hamiltonian, [S,H] = 0. Without loss of 
generality, S is Hermitian. The vanishing commutator 
allows S and H to be simultaneously diagonalized. The 
fcrmion operators can be chosen to diagonalize S, result- 
ing in the simple commutation relations 

[S,c l ]^s l c l . (49) 

Examples of common 1-fermion invariant operators are 
total fcrmion number, total fermion spin, lattice vector 



translations, and point group operations. The C(f , g) 
operators retain the simple commutation relations, 

[S,C(f,g)}=C(f, g )J2s i ( 9i -f i ). (50) 

i 

Only C(f , g) operators that commute with S are required 
to represent H and the symmetry-preserving generators 
A that diagonalize it. 

The disadvantage of the C(f, g) operators is their more 
complicated operator algebra. They are not Hermitian 
and are related to their Hermitian conjugates by 

(7t(f,g) = (-l)LHf©sll/2Jc(g,f). (51) 

The action of C*(f,g) on Fock states can now produce 
zero, 

C(f,g)|h) = w (h-g + f)(-l)( fn s)- h 

x s(f ffig,h)|f ffigffih), (52) 

encoded in an occupation vector validity function, 

vM = r i, xe {o,i}™ (53) 

^ ' |^0, otherwise ' ^ ' 

The product of two basis operators is no longer always a 
single basis operator, 



C(f,g)C(h,k) =v(f 



with x 
and y 



g + h 

E 



k)s(f e g, h e k)(-i)( fffik )-(s nh )2-( f ®s)-(hek) 

2^ (-l)ll"\(8n h )ll6(xez,yez) 
z\[(feg)n(hek)]=o 

[(f \ g) \ (k \ h)] © [(h \ k) \ (g \ f )] © [(f n g) \ (h u k)] e [(h n k) \ (f u g)] 
[(g \ f) \ (h \ k)] © [(k \ h) \ (f \ g )] © [(f n g) \ (h u k)] © [(h n k) \ (f u g)] 



(54) 



r 



In this arrangement, C(x, y) is the contribution with the 
smallest value of ||(x © z) U (y © z)||. The commutator 
formula is just two applications of the product formula 
and does not further simplify except to cancel some terms 
in the sum over z and vanish when 



(f © g) • (h © k) = and 
[(f © k) ■ (g n h) + (g © h) • (f n k) 
+ ||f©g||||hffik||l 



(55) 



mod 2 = 0. 



A. Model and generator subspaces 



The model subspace of a many-fermion system should 
contain all operators necessary for an accurate physical 
description of the Hamiltonian as it is transformed to a 
diagonal form. It is generally observed that basis opera- 
tors containing fewer elementary fermion operators have 
more physical importance. In systems where the physics 
is geometrically local, geometric constraints may also de- 
termine the importance of basis operators. Careful study 
might reveal further crucial system-specific sets of basis 
operators, distinct from either general criterion. To allow 
for all these possibilities, the model subspace is defined 
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by an allowed set of occupation vectors V as 

{B(f , g):fu g ey}, feV=^f\xey. (56) 

B(i , g) and C(f , g) are interchangeable in this definition. 
The constraints on V are minimal and it is straightfor- 
ward to expand any set to satisfy them. Operators in the 
span of this basis are defined to be V -sparse. 

The generator subspace is chosen to contain all off- 
diagonal and anti-Hcrmitian operators in the model sub- 
space. In terms of B(f, g), this means basis elements 
of the form iB(f, g), f ^ g. In terms of C(f, g), this 
means basis elements of the form C(f, g) — C T (f, g) for 
a real operator space. For a complex operator space, the 
subspace must also include i[C(f, g) + C'(f, g)], f / g. 
This choice minimizes the truncation errors in a weak, 
"greedy" way. For a given f in V, if a Hamiltonian con- 
tains only terms of the form B(g, h) with (g U h) \ f = 0, 
then diagonalization can be performed exactly. 

The use of B{£ , g) or C{f , g) as both an operator basis 
and subspace basis simplifies operator truncation. Since 
the basis elements are trace-orthogonal, projections us- 
ing Vm or Tq just discard elements not in the subspace. 
This can be physically interpretted as a form of normal 
ordering based truncation. The standard normal order- 
ing rules for a reference Slater determinant |z) arrange 
all number operators into the form n m — z m as in C z (f , g) 
defined below in Eq. (65). If the reference is an ensem- 
ble of Slater determinants with statistically uncorrelated 
occupations, z m G [0,1], the standard normal ordering 
rules still apply. The C (f , g) operators and truncation 
derived in section III correspond to the infinite tempera- 
ture thermal ensemble, z m = 1/2, that equally weights all 
fermion configurations. This prevents the physics from 
being biased by a choice of reference state. For a trunca- 
tion process meant to approximate the entire spectrum 
of a many-fermion system, it is unreasonable to expect 
any one reference state to be suitable for the description 
of all eigenstates. 

The suggested choice of model and generator subspaces 
defined by l^-sparsity satisfies all the criteria established 
in section III A to guarantee Hamiltonian diagonaliza- 
tion. The only criterion that requires discussion is the 
existence of non-zero values for Eq. (30). The analy- 
sis is simplest in the C (f , g) basis because the truncated 
diagonal commutator VgCIHm^m preserves the vector 
f — g. This property can be exploited by writing the 
Hamiltonian in the form 



H 



M 



H M + 



fg=0 



C(f,g)£>(f,g), 



(57) 



where D{i , g) are non-zero diagonal Hermitian operators 
that commute with C(f, g). If Hm is in the right null 
space of VgC{Hm]Vm, then each non-zero term of the 
form C(f, g)Z)(f, g) is also in the null space. The gener- 
ator A can be chosen as any anti-Hermitian combination 



of these null operators. If the generator is chosen to be 
[C(f, g) — (7 T (f , g)]Z)(f, g) then the existence condition 
reduces to 



|^ M ^ 2 (f,g)[C(f,g),C(g,f)]|| ^0. 



(58) 



The left hand side of Eq. (58) can be bounded from below 
by replacing £> 2 (f, g) by its trace. For Z)(f,g) ^ 0, the 
trace of D 2 (f , g) is non-zero and can be ignored. The 
remaining commutator is unaffected by the truncation 
and can be explicitly calculated as 

\\[C(f,g),C(g,f)]\\ =2-ll f ®*V2"+ 1 &(||f©g||) (59a) 



L(*-i)/2J 

m= E 



jt (i-2i-l)!(2j + l)!' 



(59b) 



The commutator always has a non-zero norm, which es- 
tablishes that Eq. (58) is satisfied. 



B. Bounding functional and preconditioner 

The bounding functional is chosen to match the form 
of Eq. (38) for d = 2, which is the simplest allowed 
functional of that form. With this choice, the asymptotic 
convergence is H-f^Mill °^ II^Mi-ill 3 - The functional can 
be written suggestively as 



=2[II^Af,i-i + ^[^.^,i-i] 

+ a 2 (a\\A t \\)\\HM^-i\\} 2 



(60a) 



H' M i _ 1 =Hm,z-i + -^Pm\Ai, Hm,i-i] (60b) 



a =\\Ai 



(60c) 



This form enables the calculation of \\Ai\\ to be weakly 
coupled to the minimization of AjL4j] if a has a weak 
dependence on Ai. 

A preconditioner is constructed by approximating the 
inverse Hessian of AjL4j]- This is straightforward in the 
untruncated case. When diagonalization is converged, 
the Hessian reduces to C[D] 2 . A preconditioner that is 
exact in this limit is 



f¥g 



|f)(g|®|f)(g| 



A(f,g) ' 

A(f,g) =((f|# Mli -i|f> - (g|^-i|g)) 2 
+ 4\({\H M , l -i\g)\ 2 +(3. 



(61a) 



(61b) 



This form includes approximate off-diagonal corrections 
using the quadratic formula and an extra uniform shift 
/?. The shift acts as either a tuning parameter to adjust 
the size of the preconditioned gradient or to approximate 
the effects of 02(a||Ai||) on the Hessian when ||^4j|| gets 
large. (WIP) 
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There is no natural extension of Eq. (61a) to the trun- 
cated case using the B(i, g) or C(f, g) basis elements. 
However, this preconditioner can be approximated by 
defining a basis E y (f , g) of the V^-sparse subspace that 
mimics the Slater determinant outer products, 

(f © z|^(h, k)|g © z) = 5[\\i - h||]«[||g - k||] (62) 

for f U g € V and h U k 6 V. S[i] is the Kronecker delta 
function (6[0] = l,8[i ^ 0] = 0). These operators are 
defined with respect to a reference Slater determinant 
z). A ^-sparse operator X v can be decomposed in this 
basis using its matrix elements, 



fugev 



If ffiz|AV|g©z)^(f,g). (63) 



The analogue of Eq. (61a) using E v (f, g) is 
E%(f,g)® |f ©z)(g©z| 



fugev 



A(f ffiz,gffiz) 



(64) 



The choice of z is arbitrary, but it should not have a 
strong effect on the quality of the preconditioner. 

An efficient explicit construction of Ey (f , g) requires 
the definition of an intermediate operator basis, 



C z (f,g) =s(f ©g,gffiz) 

xn{a-/i)(i-5i) 



(65) 



+ mfigi + (1 - 2n t )z l f l g l 
+ c\[f i {l-g i ) + z i {g i - U)] 
+ Cj[(l - fi)9i + Ziifi -9i)]}- 

This is a variant of C(f , g) that is normal ordered with 
respect to |z). E v (f, g) can be constructed by writing 
fffiz)(gffiz in terms of (7 z (f,g) and projecting into the 
V-sparse subspace, 

£^(f,g)= Yl (-l) l|h "6 z (f ©h,geh) (66a) 
(fug)ehev 

(fUg)-h=0 

C z (f , g) - ]T *(f © g, h)E v (f © h, g © h). 



(fug)ehey 
(fu g )-h=o 



(66b) 



The inverse transformation is calculated using Eq. (63). 
The transformations between E v (f, g) and (7 z (f,g) are 
independent of z. 

A few remaining formulas are required to transform 
between the B(f,g) basis and E v (f, g) basis. The miss- 
ing intermediate steps are the transformations between 



C(f,g) and (7 z (f,g), 

C(f,g) =(-l)( fn s)' z s (f ©g,gffiz) 

x J2 (-2) l|h "s(f ©g,h) 



(67a) 



h\(fn g )=o 

x(7 z (f'ffih,g'ffih) 
C z (t , g) =2- rs s(f © g, g © z) (67b) 
x J2 (-l)^C(f'®h,g'®h), 



h\(fn g )=o 

with f =(f \ z) © (g n z) © (f n g) 
and g' =(g \ z) © (f n z) © (f n g). 



(67c) 
(67d) 



The complete transformation is performed as a sequence 
of three intermediate steps: B o C using Eq. (45), 
C «• C* z using Eq. (67), and C z ^ using Eq. (66). 



V. TRUNCATED EIGENFERMION 
DECOMPOSITION 

Many-fermion methods, especially those in quantum 
chemistry, are often arranged as a systematic hierarchy 
of increasing cost and accuracy. While it is possible to 
construct a TED for a flexible choice of operator basis 
limited only by Eq. (56), this section considers only a 
specific hierarchy of methods. The methods are referred 
to as TEDr for an integer r and defined by the model 
operator subspace 



{C(f,g) : ||fUg|| <r}. 



(68) 



TEDr is exact for r = n and accuracy should systemat- 
ically improve with increasing r. Computational scaling 
of the methods depend on r and the total number of 
fermion degrees of freedom n. The memory required to 
store each operator scales as 0(n r ). Commutation of op- 
erators is the most computationally expensive step of the 
TED and scales as 0(n'- 1 - 5r J) operations. 0(n r ) memory 
and 0(nL 15r J) operations are taken to be the computa- 
tional budget of TEDr for calculating physical properties 
following the diagonalization of a Hamiltonian. 

A Hamiltonian diagonalized by TEDr as in Eq. (7) 
has the form 



D= J2 d(f)d(f,f). 



(69) 



HflKr 



An eigenvalue E(z) can be calculated with this formula 
by replacing hi with Zj. This calculation scales as 0(n r ) 
operations. Only 0(nL 1 - 5r J/ r ) eigenvalues can be calcu- 
lated this way before the computational budget is ex- 
hausted. A specialized alternative is to transform D into 
the E v (f, f) operator basis, 



D= J2 E(t(Bz)Ey({,t). 



(70) 



||f||<r 
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This simultaneously calculates 0(n r ) eigenvalues corre- 
sponding to few-eigenfcrmion excitations from a reference 
eigenstate |*(z)). Each of these transformations costs 
0(n r ) operations, which increases the total number of 
computable eigenvalues to 0(rjL 15r J). 

It is in the study of energetics that the TED can be 
considered a quantum-to-classical mapping. All energies 
come from E(f) in Eq. (3a), which for TEDr has a 
form that resembles the total energy cluster expansions 
used in the study of alloys 16 . The complexity of finding 
the ground state is dramatically reduced from the initial 
Hamiltonian, 



from min V 1 ' / to min£(f). (71) 
|*) (* *) f 



the transformed operator in the Ey(f, g) operator basis, 



&X'U = <*(f©z)l*'l*(g©z)>^(f,g), (72) 

HfUelKr 



l|fug||< 

a set of approximate matrix elements close to a reference 
eigenstate |*(z)) can be efficiently computed. 

The calculation of reduced density matrices is 
an established application of truncated unitary 
transformations 12 . They can be calculated as a 
subset of (vf'(z)IC'(f, g)|*(z)) for a chosen eigenstate 
and all elements of the model subspace. To close this 
calculation, the infinitesimal transformation of a matrix 
element must be related back to the untransformed 
matrix elements. The key step is performing the trans- 
formation backwards by defining a new unitary operator 
V(X) that evolves as 



A minimization over 2™ complex numbers is reduced 
to n binary choices. Complexity theory still classifies 
both problems as hard, QMA-complete for the quantum 
problem 17 and NP-complete for the classically-mapped 
problem 18 . Practically, many physical systems of interest 
are unfrustrated and will result in easy instances of min- 
imization over E(i). Even in hard cases, simple heuris- 
tics can give good results. Eq. (70) can be calculated 
for random sets of z and energy-lowering few-fermion ex- 
citations can be successively applied. In an easy prob- 
lem, most or all initial configurations will relax to the 
ground state. A hard "glassy" problem will have many 
distinct local minima in configuration space. In "exotic" 
systems where the low-energy excitations are not few- 
eigenfermion excitations, this procedure might help to 
map out the energy landscape. 

To calculate eigenstate matrix elements of an opera- 
tor X, a truncated unitary transformation is performed. 
This exactly calculates WX'U for a perturbed operator 
X 1 = X - UXrU^ as in Eq. (22). The transformation 
costs 0(nL 15r J) operations, which exceeds the compu- 
tational budget if more than 0(1) operators are calcu- 
lated in this manner. Calculations of this nature are 
able to produce a large amount of spectral information 
for a small number of operators. This might be use- 
ful for categorizing eigenstates and transitions based on 
a small number of important observables. By rewriting 



dX 



V(X) = —A(X F — X)V(X), 



(73) 



with initial condition V(Q) = I. The entire transforma- 
tion that defines U must occur in the interval [0,Af], 
resulting in V(Xp) = U. Using the same truncation as 
in the transformation of operators, the truncated trans- 
formation of matrix elements is defined as 



^(f,g)>(A) 



(V M [A(X F -X),C(f,g)])(X), (74) 



with ((7(f,g))(A) ss (z|t>t(A)C'(f,g)V r (A)|z). For a piece- 
wise constant generator, the transformation can be eval- 
uated as a sequence of superoperator exponentials, 

(C(f,g))i = (exp(^ m _ 4+1 )C(f,g)) 4 -i, (75) 

where m is the number of generator segments. As with 
operator transformations, the cost of this calculation 
scales as 0(n L 1 - 5r J) operations. 
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